345 research outputs found

    Deep Classifier Mimicry without Data Access

    Access to pre-trained models has recently emerged as a standard across numerous machine learning domains. Unfortunately, access to the original data the models were trained on may not equally be granted. This makes it tremendously challenging to fine-tune, compress models, adapt continually, or to do any other type of data-driven update. We posit that original data access may however not be required. Specifically, we propose Contrastive Abductive Knowledge Extraction (CAKE), a model-agnostic knowledge distillation procedure that mimics deep classifiers without access to the original data. To this end, CAKE generates pairs of noisy synthetic samples and diffuses them contrastively toward a model's decision boundary. We empirically corroborate CAKE's effectiveness using several benchmark datasets and various architectural choices, paving the way for broad application.
    Comment: 10 pages main, 4 figures, 2 tables, 2 pages appendix

    Self Expanding Neural Networks

    The results of training a neural network are heavily dependent on the architecture chosen, and even a modification of only the size of the network, however small, typically involves restarting the training process. In contrast to this, we begin training with a small architecture, only increase its capacity as necessary for the problem, and avoid interfering with previous optimization while doing so. We thereby introduce a natural-gradient-based approach which intuitively expands both the width and depth of a neural network when this is likely to substantially reduce the hypothetical converged training loss. We prove an upper bound on the "rate" at which neurons are added, and a computationally cheap lower bound on the expansion score. We illustrate the benefits of such Self-Expanding Neural Networks in both classification and regression problems, including those where the appropriate architecture size is substantially uncertain a priori.
    Comment: 10 pages, 4 figures

    Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data

    Backdoor attacks pose a serious security threat for training neural networks as they surreptitiously introduce hidden functionalities into a model. Such backdoors remain silent during inference on clean inputs, evading detection due to inconspicuous behavior. However, once a specific trigger pattern appears in the input data, the backdoor activates, causing the model to execute its concealed function. Detecting such poisoned samples within vast datasets is virtually impossible through manual inspection. To address this challenge, we propose a novel approach that enables model training on potentially poisoned datasets by utilizing the power of recent diffusion models. Specifically, we create synthetic variations of all training samples, leveraging the inherent resilience of diffusion models to potential trigger patterns in the data. By combining this generative approach with knowledge distillation, we produce student models that maintain their general performance on the task while exhibiting robust resistance to backdoor triggers.
    Comment: 11 pages, 3 tables, 2 figures

    FEATHERS: Federated Architecture and Hyperparameter Search

    Deep neural architectures have a profound impact on achieved performance in many of today's AI tasks, yet their design still heavily relies on human prior knowledge and experience. Neural architecture search (NAS) together with hyperparameter optimization (HO) helps to reduce this dependence. However, state-of-the-art NAS and HO rapidly become infeasible with an increasing amount of data being stored in a distributed fashion, typically violating data privacy regulations such as GDPR and CCPA. As a remedy, we introduce FEATHERS (FEderated ArchiTecture and HypERparameter Search), a method that not only optimizes both neural architectures and optimization-related hyperparameters jointly in distributed data settings, but further adheres to data privacy through the use of differential privacy (DP). We show that FEATHERS efficiently optimizes architectural and optimization-related hyperparameters alike, while demonstrating convergence on classification tasks at no detriment to model performance when complying with privacy constraints.
    Comment: Main paper: 8 pages, References: 2 pages, Supplement: 4.5 pages, Main paper: 3 figures, 2 tables, 1 algorithm, Supplement: 2 figures, 4 algorithms, extended previous version by Differential Privacy, theoretical results and more experiments. Updated author list as it was incomplete

    Monatomic phase change memory

    Phase change memory has been developed into a mature technology capable of storing information in a fast and non-volatile way, with potential for neuromorphic computing applications. However, its future impact in electronics depends crucially on how the materials at the core of this technology adapt to the requirements arising from continued scaling towards higher device densities. A common strategy to fine-tune the properties of phase change memory materials, reaching reasonable thermal stability in optical data storage, relies on mixing precise amounts of different dopants, often resulting in quaternary or even more complicated compounds. Here we show how the simplest material imaginable, a single element (in this case, antimony), can become a valid alternative when confined in extremely small volumes. This compositional simplification eliminates problems related to unwanted deviations from the optimized stoichiometry in the switching volume, which become increasingly pressing when devices are aggressively miniaturized. Removing compositional optimization issues may allow one to capitalize on nanosize effects in information storage.